A METHOD FOR GENE IDENTIFICATION BASED ON DIFFERENTIAL DNA
METHYLATION

This application claims priority of provisional application
U.S. Serial No. 60/346,050, filed October 24, 2001, the 
contents of which are incorporated herein by reference. 

The invention described herein was made with government support 
under NIH Grant 1 R01-HG002425-01. Accordingly, the United
States government has certain rights in this invention. 

Throughout this application, various references are cited. 
Disclosure of these references in their entirety is hereby 
incorporated by reference into this application to more fully
describe the state of the art to which this invention pertains. 

Background of the Invention 

The mammalian genome contains approximately 3 x 10⁷
5-methylcytosine (m⁵C) residues, all or most at 5'-m⁵CpG-3'.
About 60% of CpG sites are methylated in the DNA of somatic
cells (Bestor et al., 1984; Li et al., 1992). Methylation
recruits a variety of transcriptional repressors, including
histone deacetylases and other proteins that cause chromosome
condensation and silencing (Schubeler et al., 2000; reviewed
by Bestor, 1998).

While it has long been known that methylation of a promoter
causes profound silencing if the sequence is rich in CpG
dinucleotides, only recently have genetic and biochemical
experiments begun to identify the biological functions of DNA
methylation after many years of controversy and speculation.
It was only recently demonstrated that the large majority
(>90%) of m⁵C actually lies within intragenomic parasites such
as transposons and endogenous retroviruses (which are rich in
the CpG dinucleotide and represent more than 45% of the genome;
Smit, 1999), and it has been hypothesized that the primary
function of cytosine methylation is host defense against the
transcription and dispersal of intragenomic parasites (Bestor,
1990; Bestor and Coxon, 1993; Bestor and Tycko, 1996; Yoder et
al., 1997). Allele-specific cytosine methylation has been
shown to be required for the monoallelic expression of some
imprinted genes. When methylation levels are reduced as a
result of homozygous targeted loss-of-function mutations in the
Dnmt1 gene, which encodes the major DNA methyltransferase of
vertebrates (reviewed by Bestor, 2000), the imprinted genes
H19, Igf2, and Igf2r are expressed at equal rates from both
parental alleles (Li et al., 1993a; 1993b).

Demethylation of the Xist gene on the X chromosome activates
Xist transcription and leads to inactivation of both X
chromosomes in female cells and of the sole X in male cells
(Panning and Jaenisch, 1996). Other data have shown that
demethylation causes fulminating transcription of endogenous
retroviral DNA to the point where retroviral transcripts become
one of the predominant mRNA species of Dnmt1 mutant embryos
(Walsh et al., 1998). However, Dnmt1 mutant embryos do not
show ectopic or precocious activation of tissue-specific genes,
and in fact the promoters of such genes are not normally
methylated in non-expressing tissues (Walsh and Bestor, 1999).
This suggested that DNA methylation might have primary roles
in processes other than reversible gene regulation during
development.

There has been much controversy over the biological roles of
cytosine methylation. The biological importance of cytosine
methylation was long in doubt, in large part because the DNA
of familiar laboratory organisms (notably yeast, Drosophila,
and C. elegans) lacks modified bases. However, genetic studies
in mice and humans have shown that abnormalities of genomic
methylation patterns have severe phenotypic consequences.
Disruption of the Dnmt1 gene (Bestor et al., 1988) showed that
demethylation of the genome caused apoptotic cell death in all
differentiating cell types, fulminating expression of normally
silenced retroposons, loss of imprinted expression at a number
of imprinted loci, ectopic X inactivation, and marked
chromosome instability manifested as a high rate of deletions
and rearrangements (reviewed by Bestor, 2000).

Human genetic disorders were recently shown to be caused by
mutations in a DNA methyltransferase gene (Xu et al., 1999) and
in a gene that encodes a protein that binds to methylated DNA
(Amir et al., 1999). The first of these, ICF syndrome, is
characterized by immunodeficiency, centromere instability, and
facial anomalies. The cytogenetic abnormalities are extreme;
chromosomes 1, 9, and 16 gain and lose short arms such that a
single chromosome can have as many as 12 short arms. The
resulting pinwheel chromosomes are highly diagnostic. The
breakage and rejoining occur at tracts of classical satellite
DNA, which is normally heavily methylated but is completely
unmethylated in DNA of ICF patients. It has been shown that
ICF syndrome is due to inactivating point mutations in the
DNMT3B gene on chromosome 20 (Xu et al., 1999).

The second syndrome, Rett syndrome, is a common
neurodevelopmental syndrome in which normal early development
is followed by a regression in all neural functions leading to
complete apraxia and death by aspiration pneumonia or heart
failure. The syndrome is due to mutations in MeCP2, which
encodes a transcriptional repressor that binds specifically to
methylated DNA (Amir et al., 1999).

Methylation abnormalities have also been seen in patients
suffering from ATRX (alpha thalassemia and mental retardation
on the X) syndrome (Gibbons et al., 2000). The genetic findings
in both mice and humans confirm that cytosine methylation has
multiple essential roles. There is, however, much remaining
uncertainty and continuing controversy as to the nature of
those roles.

Another aspect of genomic methylation patterns is the frequent
finding of ectopic de novo methylation of CpG islands
associated with tumor suppressor genes in human tumors and
tumor cell lines (reviewed by Warnecke and Bestor, 2000).
First observed at RB1, ectopic promoter methylation has come
to be regarded as a common mechanism by which tumor suppressor
genes are inactivated in cancer. However, there is little
direct evidence that the observed methylation is responsible
for the silencing, and most studies have used DNA from cultured
tumor cell lines in which genomic methylation patterns are very
unstable. Nonetheless, the high frequency with which promoter
methylation is observed at tumor suppressor loci indicates the
possibility that this feature can be used to identify candidate
tumor suppressor genes that might not be identified through
other means.



Given that inherited and somatic changes in methylation
patterns are involved in human disease, it is unfortunate that
so little should be known of the basic organization of genomic
methylation patterns. The methylation landscape of the human
genome, as well as the role of methylation pattern dynamics in
normal development, carcinogenesis, and human genetic disorders,
remains an important area for exploration. Unfortunately,
there remains a need for experimental methods suitable for
investigating methylation's role in the genome.
Summary of the Invention

This invention provides a method for detecting the presence of 
differential methylation between DNA from a first source and 
the corresponding DNA from a second source, which method 
comprises the steps of 

(a) (i) contacting an agent that degrades methylated DNA
with a DNA sample from the first source, under suitable
conditions, so as to degrade methylated DNA in the first
sample, and (ii) contacting an agent that degrades
unmethylated DNA with a DNA sample from the second
source, under suitable conditions, so as to degrade
unmethylated DNA in the second sample;

(b) contacting the resulting samples with each other 
under conditions permitting reannealing between the DNA 
strands therein, so as to permit the formation of a 
hybrid DNA duplex comprising a DNA strand from the first 
source and a DNA strand from the second source, should 
both such strands be present; and 

(c) detecting the formation of any such hybrid DNA 
duplex, such formation indicating the presence of 
differential methylation between the DNA from the first 
source and the corresponding DNA from the second source. 

This invention also provides a method for determining the 
presence of a tumor suppressor gene in a DNA sample from a 
tumor cell, which method comprises the steps of 

(a) (i) contacting an agent that degrades unmethylated
DNA with the DNA sample from the tumor cell, under
suitable conditions, so as to degrade unmethylated DNA in
the sample, and (ii) contacting an agent that degrades
methylated DNA with a DNA sample from a normal cell
corresponding to the tumor cell, under suitable
conditions, so as to degrade methylated DNA in the
sample;

(b) contacting the resulting samples with each other
under conditions permitting reannealing between the DNA
strands therein, so as to permit the formation of a
hybrid DNA duplex comprising a DNA strand from the normal
cell and a DNA strand from the tumor cell, should both
such strands be present;

(c) detecting the formation of any such hybrid DNA
duplex, such formation indicating the presence of
differential methylation between the DNA from the normal
cell and the corresponding DNA from the tumor cell; and

(d) determining whether the DNA strand from the tumor
cell in the hybrid DNA duplex detected in step (c)
comprises a tumor suppressor gene, thereby determining
the presence of a tumor suppressor gene in the DNA sample
from the tumor cell.
Brief Description of the Figures

Figure 1

Organization of transposons, exons, and HpaII (CCGG) sites
within the human HPRT gene. Organization of HPRT is typical
of human genes (Yoder et al., 1997). CCGG sites located in
known transposons and in cellular sequences are shown in
contrasting shades; note the concentration of cellular CCGG
sites in the CpG island at the 5' end of the gene. Nearly all
of the CCGG sites within the body of the gene are in
transposons. As shown by the scale at right, the gene is
methylated at these sites and unmethylated at the CpG island,
as is true of the large majority of cellular genes. The CpG
island undergoes dense de novo methylation when located on the
inactive X chromosome, but is completely unmethylated on the
active X (Litt et al., 1996). CCGG sites are shown here as
they are most often used to evaluate methylation patterns by
Southern blot analysis.

Figures 2A-2C

Removal of methylated sequences by McrBC digestion and of
unmethylated sequences by RE digestion. (2A) Unmethylated S.
pombe DNA was resistant to McrBC digestion (lane 3); after
methylation of all CpG sites by treatment with M.SssI it
became very sensitive (lane 4). Unmethylated S. pombe DNA is
very sensitive to RE treatment (lane 5; discrete bands were
derived from very G + C-poor mitochondrial DNA). (2B)
McrBC-resistant fragments in human Jurkat test DNA (lane 2) were
sensitive to RE treatment, indicating that they were in fact
unmethylated in the starting DNA and did contain CpG sites.
(2C) Methylation of human DNA at all CpG sites with M.SssI
shows that the McrBC-resistant fraction > 500 bp in lane 5 is
unmethylated, as shown by the acquisition of McrBC sensitivity
after M.SssI treatment (lane 3). Gap below 500 bp in all
panels is an artifact of bromphenol blue.
Figure 3

Removal of endogenous methylated sequences from McrBC
libraries. LINE-1 (L1) elements (left) and satellite 3 DNA
are normally heavily methylated. The figure shows that these
sequences are largely removed from human DNA by digestion with
McrBC. The size range is set between the lower limit for CpG
islands (~500 bp) and the upper limit for clonability in
plasmid vectors.
Detailed Description of the Invention

Definitions

As used in this application, except as otherwise expressly
provided herein, each of the following terms shall have the
meaning set forth below.

"Normal cell corresponding to a tumor cell" shall mean a
non-diseased cell of the same type as that from which the tumor
cell originated.

"Source of DNA" includes, but is not limited to, a normal
tissue, a diseased tissue, a cell, a virus, and populations
thereof, a biological fluid sample, a cultured cell or
population thereof, a tissue or cell biopsy, a pathological
sample, a forensic sample, a chromosome, chromatin, genomic
DNA, a DNA library and an isolated gene.

As used herein, "subject" means any animal or artificially
modified animal. Animals include, but are not limited to,
mice, rats, dogs, guinea pigs, ferrets, rabbits, and primates.
In the preferred embodiment, the subject is a human.

Embodiments of the Invention

This invention provides a first method for detecting the
presence of differential methylation between DNA from a first
source and the corresponding DNA from a second source, which
method comprises the steps of

(a) (i) contacting an agent that degrades methylated DNA
with a DNA sample from the first source, under suitable
conditions, so as to degrade methylated DNA in the first
sample, and (ii) contacting an agent that degrades
unmethylated DNA with a DNA sample from the second
source, under suitable conditions, so as to degrade
unmethylated DNA in the second sample;

(b) contacting the resulting samples with each other 
under conditions permitting reannealing between the DNA 
strands therein, so as to permit the formation of a 
hybrid DNA duplex comprising a DNA strand from the first 
source and a DNA strand from the second source, should 
both such strands be present; and 

(c) detecting the formation of any such hybrid DNA 
duplex, such formation indicating the presence of 
differential methylation between the DNA from the first 
source and the corresponding DNA from the second source. 
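
The logic of steps (a) through (c) can be illustrated with a toy computational model. The following Python sketch is offered only as an aid to understanding and is not part of the method itself; the locus names, the methylation calls, and the function names are hypothetical. It assumes each locus is either fully methylated or fully unmethylated in each source, and it treats hybrid duplex formation as simple set intersection.

```python
# Toy model of steps (a)-(c). All locus names and methylation states
# below are invented for illustration only.

def survives_methylated_degradation(methylated: bool) -> bool:
    """Step (a)(i): an agent that degrades methylated DNA spares unmethylated DNA."""
    return not methylated


def survives_unmethylated_degradation(methylated: bool) -> bool:
    """Step (a)(ii): an agent that degrades unmethylated DNA spares methylated DNA."""
    return methylated


def hybrid_duplex_loci(first_source: dict, second_source: dict) -> set:
    """Steps (b)-(c): a hybrid duplex can form only at loci that survive in both samples."""
    first_survivors = {locus for locus, m in first_source.items()
                       if survives_methylated_degradation(m)}
    second_survivors = {locus for locus, m in second_source.items()
                        if survives_unmethylated_degradation(m)}
    return first_survivors & second_survivors


# Hypothetical methylation states: True = methylated, False = unmethylated.
first = {"locus_A": False, "locus_B": True, "locus_C": False}
second = {"locus_A": True, "locus_B": True, "locus_C": False}

differential = hybrid_duplex_loci(first, second)
print(sorted(differential))
```

In this model only locus_A, which is unmethylated in the first source and methylated in the second, is recovered. Loci with the opposite difference would be recovered by exchanging the agents applied to the two sources.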

In one embodiment, the first method further comprises the step
of modifying the DNA of parts (i) and (ii) resulting from step
(a) with a first and second moiety, respectively, so as to
prevent, in step (b), the formation of a DNA duplex consisting
of DNA strands from the first source or of a DNA duplex
consisting of DNA strands from the second source. In one
example, the modification of at least one sample resulting
from step (c) comprises modifying the DNA in at least one
sample with a moiety which facilitates the isolation of hybrid
DNA duplexes formed in step (b). Such moieties are well known
in the art and include, for example, biotin.

In another embodiment, the first method further comprises the
step of determining the nucleic acid sequence of a hybrid DNA
duplex whose presence is detected in step (c). In one
example, this step further comprises the step of identifying
the methylated nucleotide residues of one or both strands of
the hybrid DNA duplex whose sequence is determined.
In the first method, the first and second sources of DNA can
be any suitable sources such as, for example, (i) a cell from 
a first tissue of a subject and a cell from a second tissue of 
that subject, respectively; (ii) a cell from a normal tissue 
and a cell from that tissue in a diseased state, respectively; 
(iii) chromosomes of a chromosome pair; (iv) a DNA library; 
and (v) an isolated gene. In the preferred embodiment, the 
isolated gene is a tumor suppressor gene. 

In another embodiment of the first method, the agent that
degrades methylated DNA is McrBC. In another embodiment, the
agent that degrades unmethylated DNA comprises a
methylation-sensitive restriction endonuclease. In one
embodiment, the methylation-sensitive restriction endonuclease
is selected from the group consisting of HpaII, HhaI, MaeII,
BstUI and AciI. In a further embodiment, the agent that
degrades unmethylated DNA comprises a plurality of
methylation-sensitive restriction endonucleases. Preferably,
the plurality of methylation-sensitive restriction
endonucleases is selected from the group consisting of HpaII,
HhaI, MaeII, BstUI and AciI.

In the preferred embodiment, the DNA from the first and second 
sources is human DNA. 

This invention also provides a second method for determining 
the presence of a tumor suppressor gene in a DNA sample from 
a tumor cell, which method comprises the steps of 

(a) (i) contacting an agent that degrades unmethylated 
DNA with the DNA sample from the tumor cell, under 
suitable conditions, so as to degrade unmethylated DNA in 
the sample, and (ii) contacting an agent that degrades 
methylated DNA with a DNA sample from a normal cell 
corresponding to the tumor cell, under suitable 
conditions, so as to degrade methylated DNA in the
sample; 

(b) contacting the resulting samples with each other 
under conditions permitting reannealing between the DNA 
strands therein, so as to permit the formation of a 
hybrid DNA duplex comprising a DNA strand from the normal 
cell and a DNA strand from the tumor cell, should both 
such strands be present; 

(c) detecting the formation of any such hybrid DNA 
duplex, such formation indicating the presence of 
differential methylation between the DNA from the normal 
cell and the corresponding DNA from the tumor cell; and 



(d) determining whether the DNA strand from the tumor 
cell in the hybrid DNA duplex detected in step (c) 
comprises a tumor suppressor gene, thereby determining 
the presence of a tumor suppressor gene in the DNA sample 
from the tumor cell.

The various embodiments set forth above with respect to the 
first method of this invention apply mutatis mutandis to the 
second method of this invention. 

This invention will be better understood from the Experimental 
Details that follow. However, one skilled in the art will 
readily appreciate that the specific methods and results 
discussed are merely illustrative of the invention as 
described more fully in the claims which follow thereafter.
Background

Host Defense Hypothesis

Applicants were the first to purify, characterize, and clone
a eukaryotic DNA methyltransferase (Dnmt1; Bestor et al.,
1988). Applicants also disrupted the Dnmt1 gene (in
collaboration with R. Jaenisch) and demonstrated that cytosine
methylation is essential for mammalian development (Li et al.,
1992). Several of the biological functions of cytosine
methylation have been deduced from studies of Dnmt1 mutant
mice. The Dnmt1 gene was the first gene shown to have
sex-specific promoters and first exons (Mertineit et al., 1998),
and deletion of the female-specific promoter and first exon
was the first pure maternal-effect mutation to be observed in
a mammal (Howell et al., 2001). Applicants also found the
first human genetic disorder to be caused by mutations in a
DNA methyltransferase gene (Xu et al., 1999), and were the
first to solve the crystal structure of a eukaryotic DNA
methyltransferase homologue, human DNMT2 (Dong et al., 2001),
whose function is unknown and is currently under study.

Applicants also put forward the idea that the primary function
of cytosine methylation is likely to be host defense against
transposons (Bestor, 1990; Yoder et al., 1997; Bestor, 2000).
The host defense hypothesis has come to be supported by a
large body of evidence and has received increasingly positive
regard after a somewhat emotional reception by colleagues
devoted to the developmental hypothesis. However, until more
is known of the large-scale patterning of cytosine methylation
in the genome, there will be continuing controversy as to the
biological functions of methylation patterns.
The shape of genomic methylation patterns

Cytosine methylation is erased by cloning in microorganisms or
by PCR amplification, and information on methylation patterns
is therefore absent from the human genome sequences produced
by both the public and private sequencing efforts.

Current methods for the analysis of cytosine methylation are
ineffective. These methods involve Southern blot analysis
after cleavage with methylation-sensitive restriction
endonucleases or PCR across the restriction sites of such
enzymes, or the sequencing of genomic DNA after deamination by
sodium bisulfite treatment, which converts unmethylated
cytosines to uracils but does not convert m⁵C, so that all
remaining cytosines must have been derived from m⁵C.
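
The bisulfite readout described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not a description of any laboratory protocol: conversion is treated as complete, only one strand is considered, and the example sequence and methylated positions are hypothetical.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Model bisulfite deamination: unmethylated C becomes U, m5C stays C.

    methylated_positions holds 0-based indices of methylated cytosines
    (an illustrative representation, assumed for this sketch).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylated(converted: str) -> set:
    """Any C remaining after conversion is inferred to have been m5C."""
    return {i for i, base in enumerate(converted) if base == "C"}


# Hypothetical example: cytosines at indices 1 and 5 are methylated.
seq = "ACGTCCGGA"
methylated = {1, 5}
converted = bisulfite_convert(seq, methylated)
recovered = infer_methylated(converted)
```

Because each residue must be read by sequencing a known region, the sketch also reflects why this approach examines only small, predetermined regions.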

These traditional methods have inherent limitations
appropriate to their pre-genomics beginnings; they are very
limited in scope and can be used to test only small regions,
they require that the sequences be known in advance and cannot
be used to extract sequences that are heavily methylated or
largely unmethylated, and the Southern blot method (which is
most widely used) can examine only a few sites with narrow
spacing requirements. It is the CpG density and methylation
status of regions of hundreds of base pairs, rather than
single CpG sites, that appear to control promoter activity
(Kass et al., 1997). Examination of single sites, which are
usually chosen on the basis of convenience, can therefore be
quite misleading. In addition to these technical issues, it is
common to work with DNA from established lines of cultured
cells rather than tissue DNA. Genomic methylation patterns
are highly unstable in cultured cells, and in cell lines the
promoters of tissue-specific genes are frequently methylated
at positions that are not methylated in non-expressing
tissues. The muscle-specific α-actin gene, for example, is
methylated in most mouse and human cell lines but is not
methylated in mouse brain, liver, or spleen, tissues that do
not express α-actin (Walsh and Bestor, 1999).

Although the extant data are fragmentary and often
contradictory, a few themes do emerge repeatedly. First,
promoter regions that are heavily methylated in tissues are
normally silent (examples are imprinted genes and those on the
inactive X chromosome in females, and promoters that have
undergone de novo methylation in cultured cells or tumors).
Second, CpG islands (regions of high G + C content and CpG
density which span or overlap the 5' ends of most genes) are
unmethylated in the germ line and in all somatic tissues,
except when associated with imprinted genes or those subject
to X inactivation. Third, gene silencing usually involves
methylation of all or nearly all CpG sites in CpG islands that
are 500-2,000 base pairs in length; methylation of non-CpG
island sequences does not usually prevent transcription (Kass
et al., 1997), and the binding of transcription factors can
actually cause demethylation of local CpG sites (Lin et al.,
2000). Fourth, the large majority of genomic m⁵C is within
transposons, which are abundant (45% of the mammalian genome;
Smit, 1999) and relatively rich in CpG dinucleotides. More
than 90% of genomic m⁵C lies within retroposons (Yoder et al.,
1997), and other repeated sequences such as pericentric
satellite DNA account for much of the remainder. However, it
must be kept in mind that the regulatory regions of cellular
genes represent much less than 1% of the total genome, and
this small contribution will not be detectable against the
large background of heavily methylated transposons and other
repeated sequences.

Most genes have unmethylated promoters in both expressing and
non-expressing tissues, although the transcribed regions tend
to be methylated. That is because introns are rich in
transposons, which are largely methylated. This is
illustrated by examination of the HPRT gene in Figure 1. The
genome browser (http://genome.ucsc.edu/goldenPath/septTracks.html;
please note that all references to "genome browser" refer to
this software) annotates transposon distributions, and all long
genes can be seen to contain multiple transposons. Some, such
as VHL, are more than 50% transposon.

While the above suggests that the genome is characterized by
unmethylated single copy cellular sequences embedded in a
background of methylated transposons, the situation is
actually more complex. CpG sites in exons can be heavily
methylated if they lie close to transposons in flanking
introns. Such CpG sites are especially vulnerable to C→T
transition mutations driven by deamination of m⁵C (Magewu and
Jones, 1994). CpG islands can be heavily methylated in normal
cells, as in the case of imprinted genes and those subject to
X inactivation, and much demethylation (Feinberg and
Vogelstein, 1983) and de novo methylation is seen in DNA of
cancer cells (reviewed by Warnecke and Bestor, 2000).
Stochastic and ectopic de novo methylation has been attributed
a role in human disorders in which there appear to be both
genetic and epigenetic contributions to phenotype (Petronis et
al., 2000). However, once again the lack of knowledge of the
large-scale patterning of m⁵C in the genome, and the lack of
a known method for the extraction of differentially methylated
sequences, has engendered controversy and slowed progress.

Methods and Results

Applicants have developed methods for the selective cloning of
the heavily methylated compartment and the unmethylated
compartment of the genome. The methylated compartment is
resistant to methylation-sensitive restriction endonucleases.
Applicants use a mixture of 5 such enzymes (HpaII, C*CGG;
MaeII, A*CGT; BstUI, *CG*CG; HhaI, G*CGC; and AciI, C*CGC and
G*CGG; asterisk identifies site of methylation that prevents
cleavage). The unmethylated compartment is resistant to
McrBC, an E. coli enzyme complex that binds to sequences of
the form Rm⁵C-(N)₄₀₋₅₀₀-Rm⁵C and degrades all internal sequences
to small fragments (Stewart and Raleigh, 1998). Little
degradation is seen when the two half-sites are more than 500
base pairs apart. The unmethylated sequences in most CpG
islands are greater than 500 base pairs (Cross et al., 2000).
The genome browser flags CpG islands of <400 base pairs as
questionable based on length alone.
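
The enzyme specificities described above can be sketched computationally. The following Python example is a simplified illustration and not a model of enzyme kinetics. It assumes symmetric CpG methylation represented by the 0-based position of the top-strand cytosine, scans the top strand only for McrBC half-sites, and approximates the spacing between half-sites as the distance between the two methylated cytosines; the function names and the test sequence are hypothetical.

```python
import re

# Recognition sites of the five methylation-sensitive endonucleases named
# in the text; AciI has two entries because its site is asymmetric.
RE_SITES = {
    "HpaII": ["CCGG"],
    "MaeII": ["ACGT"],
    "BstUI": ["CGCG"],
    "HhaI": ["GCGC"],
    "AciI": ["CCGC", "GCGG"],
}


def cleavable_sites(seq, methylated_cpgs):
    """List (enzyme, start) for sites in which no CpG cytosine is methylated."""
    seq = seq.upper()
    hits = []
    for enzyme, sites in RE_SITES.items():
        for site in sites:
            # Lookahead so that overlapping occurrences are all found.
            for match in re.finditer(f"(?={site})", seq):
                start = match.start()
                cpgs = [start + k for k in range(len(site) - 1)
                        if site[k:k + 2] == "CG"]
                if not any(p in methylated_cpgs for p in cpgs):
                    hits.append((enzyme, start))
    return sorted(hits, key=lambda h: (h[1], h[0]))


def mcrbc_sensitive(seq, methylated_cpgs, min_gap=40, max_gap=500):
    """True if two purine-m5C half-sites lie min_gap to max_gap bases apart."""
    seq = seq.upper()
    half_sites = sorted(p for p in methylated_cpgs
                        if 0 < p < len(seq) and seq[p] == "C" and seq[p - 1] in "AG")
    return any(min_gap <= b - a <= max_gap
               for i, a in enumerate(half_sites)
               for b in half_sites[i + 1:])


# Hypothetical fragment: two HhaI sites (GCGC) separated by an A-rich spacer.
fragment = "AGCGCT" + "A" * 60 + "GCGCA"
unmethylated_cuts = cleavable_sites(fragment, set())
methylated_cuts = cleavable_sites(fragment, {2, 67})
```

For this fragment the unmethylated form is cut at both HhaI sites and is not a McrBC substrate, while the fully methylated form resists the endonucleases and carries two half-sites within the 40 to 500 base pair window.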

To confirm the reported behavior of the enzymes, applicants
first treated the unmethylated DNA of Schizosaccharomyces
pombe with McrBC (New England Biolabs) or with the mixture of
methylation-sensitive restriction endonucleases (referred to
as "RE treatment"). As shown in Figures 2A-2C, the
unmethylated DNA was completely resistant to McrBC, but was
degraded to small fragments by RE treatment (lanes 3 and 5;
bands in lane 5 are from mitochondrial DNA, which is A + T-rich
and poor in CpG dinucleotides). When S. pombe DNA was
methylated at all CpG sites by in vitro treatment with the DNA
methyltransferase M.SssI (New England Biolabs) and S-AdoMet,
it was rendered completely resistant to RE treatment (lane 6)
but became very sensitive to McrBC (lane 4). The DNA of
cultured Jurkat cells (a human T cell leukemia cell line) was
sensitive to McrBC, but markedly less so than artificially
methylated S. pombe DNA, which has no unmethylated compartment
(lanes 4 and 8). These test data confirm that McrBC and RE
treatment have the expected effects on methylated and
unmethylated sequences.

Even though McrBC has relaxed sequence and spacing
requirements, it was of concern that the McrBC-resistant
fraction shown above may have been derived from methylated DNA
that has a very low CpG density and therefore lacks half-sites
in the configuration required for McrBC digestion. If this
were so, the McrBC-resistant fraction would also be RE
resistant as a result of methylation or sparse CpG sites. As
shown in Figure 2B, the McrBC-resistant fraction is very
sensitive to RE treatment, and Figure 2C shows that
methylation of CpG sites converts the McrBC-resistant fraction
to McrBC-sensitive. These data confirm that the McrBC library
is composed largely of unmethylated CpG-containing sequence
tracts.

Applicants next confirmed that sequence compartments in the
human genome that are known to be heavily methylated can be
eliminated by McrBC digestion but resist RE treatment.
Applicants chose the promoter region of a LINE-1 element,
L1.3, that has been shown to belong to a family of actively
transposing L1 elements (reviewed by Kazazian and Moran,
1998). These have been found to be heavily methylated in all
cell types examined. Also tested was classical satellite 3
DNA from chromosome 9, which is densely methylated in all
normal cells but is unmethylated in patients with ICF syndrome
and in certain tumor types (Xu et al., 1999). As shown in
Figure 3, the specific methylated sequences could be almost
completely removed by McrBC treatment. Applicants have
prepared plasmid libraries of human genomic DNA restricted by
McrBC or by RE treatment. A size selection is performed as
indicated in Figure 3 to reduce the already low background,
and the DNA is cloned into the SmaI site of pBluescript after
blunting insert ends with T4 DNA polymerase. These McrBC
libraries will be depleted in heavily methylated sequences,
while the RE libraries will be enriched in such sequences.

These data show that the instant methods allow the selective
cloning of both the unmethylated and heavily methylated
compartments of the genome. Sequence analysis of McrBC and RE
libraries permits the first objective large-scale view of the
methylation landscape of the human genome. These data also
facilitate the identification of CpG islands by objective
criteria. The present computational methods must use
arbitrary thresholds for CpG density and G + C content and
tend to overestimate CpG island number by a factor of 2. For
example, distal 21q contains 110 genes, but 234 predicted CpG
islands. It seems unlikely that gene number was
underestimated by a factor of 2. Subtractive hybridization of
the McrBC and RE libraries permits selective extraction of
sequences that are differentially methylated between normal
and cancer cells, between tissues of normal individuals and
those with genetic disorders such as Rett and ICF syndromes,
and between alleles in the case of imprinted genes. All these
data can be analyzed on-line by new computational methods and
added as annotation to the human genome browser in a fully
automated and almost real-time basis.
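
The threshold-based computational prediction of CpG islands referred to above can be sketched as follows. This Python example is illustrative only. The 500 base pair minimum follows the lower limit for CpG islands stated in this application, whereas the G + C content cutoff of 0.5 and the observed/expected CpG ratio cutoff of 0.6 are assumed values chosen for the sketch, and it is exactly such cutoffs that the text characterizes as arbitrary. For the distal 21q figures given above, 234 predicted islands for 110 genes is a ratio of about 2.1.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed CpG count divided by the count expected from base composition."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    observed = seq.count("CG")
    return observed * len(seq) / (c_count * g_count)


def looks_like_cpg_island(seq, min_len=500, min_gc=0.5, min_obs_exp=0.6):
    """Apply length, G + C, and CpG-density cutoffs (assumed values)."""
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_obs_exp)


# Hypothetical test sequences.
cpg_rich = "CG" * 300   # 600 bp, entirely CpG
at_rich = "AT" * 300    # 600 bp, no G or C
predicted_to_gene_ratio = 234 / 110
```

Small changes to the assumed cutoffs change which sequences qualify, which is why an experimental criterion such as McrBC resistance offers a more objective basis for identifying islands.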

Discussion 

In the mammalian genome, DNA methylation occurs predominantly 
at cytosine residues found in the context of CpG 
dinucleotides. In contrast to genetic alterations, cytosine 
methylation is an epigenetic modification, which is 
potentially reversible and does not alter DNA sequence. As 
the best-characterized mechanism of epigenetic 
regulation, DNA methylation has been implicated in a number of 
biological processes, including genomic imprinting, X- 
inactivation, and silencing of parasitic DNA. Abnormal 
cytosine methylation is thought to contribute to disease 
states, as aberrant genomic methylation patterns have been 
observed in cancer and genetic disorders, such as ICF Syndrome 
and Rett Syndrome, as well as schizophrenia. Demethylation 
also destabilizes the genome and can contribute to the 
development of cancer. Given the deleterious effects of 
aberrant DNA methylation, it is surprising how little is known 
about normal methylation patterns in the mammalian genome. 
This is due in part to the lack of efficient methods for the 
identification of regions of the genome that differ in 
methylation status between cell types. Such a method would be 
very powerful in the identification of tumor suppressors; once 
identified, such new tumor suppressors become targets of 
rational drug design. 

Until recently, the analysis of altered methylation patterns 
has been limited to a small number of predetermined genes. 
Traditional molecular biology techniques, such as Southern 
blotting and polymerase chain reaction (PCR), are not capable 
of analyzing global methylation patterns and they cannot be 
used to isolate sequences on the basis of abnormal methylation 
status. More recent approaches, Restriction Landmark Genome 
Scanning (RLGS) and Methylation-Sensitive Representational 
Difference Analysis (MS-RDA), have met with limited success. 
RLGS is a cumbersome, labor-intensive method in which 
methylation changes are visualized as a dense cluster of 
"spots" on 2-dimensional gels. This poor resolution presents 
major difficulties in the identification and isolation of 
genomic loci, making RLGS unsuitable for high-throughput 
analysis. MS-RDA is a PCR-based technique that is biased 
toward short DNA fragments and against GC-rich sequences. 
Novel array-based methods have also been developed, but these 
rely heavily on hybridization kinetics. All existing methods 
are vulnerable to the presence of normal cells in the diseased 
tissue. With the increasing emphasis on the potential role of 
methylation in human diseases, there is an immediate need for 
an effective method for identifying genome-wide changes in DNA 
methylation in human tissue samples. 

To meet this need, applicants have developed a novel method 
for identifying alterations in DNA methylation. This 
procedure, which applicants refer to as Methylation 
Subtraction Analysis (MSA), relies on the enzymatic 
fractionation of the human genome into its methylated and 
unmethylated compartments. This fractionation method coupled 
with standard molecular biology techniques facilitates the 
identification and isolation of genomic sequences that are 
unmethylated in normal tissue but have become hypermethylated 
in disease tissue. 
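
The subtraction logic described above can be sketched as simple set operations. This is an idealized model, not applicants' protocol: it assumes complete digestion, represents each DNA fragment as a hypothetical `(locus, is_methylated)` pair, and treats hybrid duplex formation as a plain intersection of the two libraries. All names are illustrative.

```python
def mcrbc_library(fragments):
    """McrBC degrades methylated DNA, so only unmethylated loci survive."""
    return {locus for locus, is_methylated in fragments if not is_methylated}


def re_library(fragments):
    """Methylation-sensitive REs degrade unmethylated DNA, so only
    methylated loci survive."""
    return {locus for locus, is_methylated in fragments if is_methylated}


def hypermethylated_in_disease(normal_fragments, disease_fragments):
    """Loci able to form a normal/disease hybrid duplex: unmethylated in the
    normal library and methylated in the disease library."""
    return mcrbc_library(normal_fragments) & re_library(disease_fragments)


# Toy data: locus A gains methylation in disease; B and C are unchanged.
normal = [("A", False), ("B", False), ("C", True)]
# The disease sample also carries unmethylated A from admixed normal cells.
disease = [("A", True), ("A", False), ("B", False), ("C", True)]

candidates = hypermethylated_in_disease(normal, disease)
```

In this model the unmethylated copy of locus A contributed by admixed normal cells is simply digested away in the disease library, so it does not mask the hypermethylated copy.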

MSA offers several key advantages over other techniques for 
identifying global changes in DNA methylation. Most 
importantly, genomic DNA used in this procedure can be 
obtained directly from normal and disease tissues rather than 
cultured cell lines. This point is underscored by the recent 
observation that more than 57% of sequences found to be 
methylated in cultured tumor cells were not methylated in the 
corresponding primary tumors. In some tumors, the error rate 
is 97% (Smiraglia et al., 2001). Another advantage of MSA is 
that it is insensitive to contamination of tumor samples by 
normal cells. One of the difficulties in analyzing tumor 
samples, for instance, is that the tumors themselves are often 
a heterogeneous mix of wild-type and cancerous cells. MSA has 
been designed so that methylated sequences from disease cells 
will be enzymatically removed from unmethylated genomic 
libraries derived from normal tissue while unmethylated 
sequences will be enzymatically removed from methylated 
libraries derived from disease tissue. This allows for 
accurate identification of genomic loci that display 
differential methylation between the normal and disease 
tissues. Finally, the robust and streamlined nature of the 
MSA procedure makes it ideal for high-throughput analyses of 
genome-wide methylation differences. Since the final readout 
is actual DNA sequence, MSA avoids the tedious cloning of 
individual candidate loci, which is a major obstacle to high-
throughput analysis. 



From a commercialization standpoint, the MSA procedure has 
several research and clinical applications. Several tumor-
suppressor genes have been identified based on the observation 
that they are aberrantly methylated in cancerous cells. This 
number, however, is an underestimation, primarily due to the 
limitations of existing methods for analyzing genome-wide 
methylation changes. To this end, MSA is well suited for the 
identification of new tumor-suppressor genes as well as genes 
that may contribute to other human disorders. Newly 
identified genes may serve as targets for future therapies 
that focus on targeted demethylation. By a simple 
modification of the fractionation procedure, MSA can also 
detect the loss of methylation. This can be used to identify 
new oncogenes that are normally silenced by methylation but 
have become activated during the oncogenic process. The 
proteins encoded by these genes may be potential drug targets 
that drive the development of new treatments. 

While methylation status of a genomic locus does not always 
signify its involvement in a particular disease, the 
methylation patterns themselves undoubtedly have diagnostic 
and prognostic value in the treatment of disease. For 
example, certain tumor types may have different 
hypermethylation profiles during the course of tumor 
progression. These tumor-specific profiles can facilitate 
early cancer diagnosis as well as cancer prognosis. MSA is 
well suited for the large-scale extraction of sequences 
subject to aberrant methylation in human cancer. Methylation 
analysis is an entirely new route to the identification of 
tumor suppressors. 

What is claimed is: 

1. A method for detecting the presence of differential 
methylation between DNA from a first source and the 
corresponding DNA from a second source, which method 
comprises the steps of: 

(a) (i) contacting an agent that degrades methylated DNA 
with a DNA sample from the first source, under 
suitable conditions, so as to degrade methylated DNA 
in the first sample, and (ii) contacting an agent 
that degrades unmethylated DNA with a DNA sample 
from the second source, under suitable conditions, 
so as to degrade unmethylated DNA in the second 
sample; 

(b) contacting the resulting samples with each other 
under conditions permitting reannealing between the 
DNA strands therein, so as to permit the formation 
of a hybrid DNA duplex comprising a DNA strand from 
the first source and a DNA strand from the second 
source, should both such strands be present; and 

(c) detecting the formation of any such hybrid DNA 
duplex, such formation indicating the presence of 
differential methylation between the DNA from the 
first source and the corresponding DNA from the 
second source. 

2. The method of claim 1, further comprising the step of 
modifying the DNA of parts (i) and (ii) resulting from 
step (a) with a first and second moiety, respectively, so 
as to prevent, in step (b), the formation of a DNA duplex 
consisting of DNA strands from the first source or of a 
DNA duplex consisting of DNA strands from the second 
source. 

3. The method of claim 2, wherein the modification of at 
least one sample resulting from step (c) comprises 
modifying the DNA in at least one sample with a moiety 
which facilitates the isolation of hybrid DNA duplexes 
formed in step (b). 

4. The method of claim 3, wherein the moiety is biotin. 

5. The method of claim 1, further comprising the step of 
determining the nucleic acid sequence of a hybrid DNA 
duplex whose presence is detected in step (c). 

6. The method of claim 5, further comprising the step of 
identifying the methylated nucleotide residues of one or 
both strands of the hybrid DNA duplex whose sequence is 
determined. 

7. The method of claim 1, wherein the first and second 
sources of DNA are a cell from a first tissue of a 
subject and a cell from a second tissue of that subject, 
respectively. 

8. The method of claim 1, wherein the first and second 
sources of DNA are a cell from a normal tissue and a cell 
from that tissue in a diseased state, respectively. 

9. The method of claim 1, wherein the first and second 
sources of DNA are both chromosomes of a chromosome pair. 

10. The method of claim 1, wherein each of the DNA samples 
from the first and second sources is a DNA library. 

11. The method of claim 1, wherein each of the DNA samples 
from the first and second sources is an isolated gene. 

12. The method of claim 11, wherein the isolated gene is a 
tumor suppressor gene. 

13. The method of claim 1, wherein the agent that degrades 
methylated DNA is McrBC. 

14. The method of claim 1, wherein the agent that degrades 
unmethylated DNA comprises a methylation-sensitive 
restriction endonuclease. 

15. The method of claim 14, wherein the agent comprises a 
methylation-sensitive restriction endonuclease selected 
from the group consisting of HpaII, HhaI, MaeII, BstUI 
and AciI. 

16. The method of claim 1, wherein the agent that degrades 
unmethylated DNA comprises a plurality of methylation-
sensitive restriction endonucleases. 

17. The method of claim 16, wherein the agent comprises a 
plurality of methylation-sensitive restriction 
endonucleases selected from the group consisting of 
HpaII, HhaI, MaeII, BstUI and AciI. 

18. The method of claim 1, wherein the DNA from the first and 
second sources is human DNA. 



19. A method for determining the presence of a tumor 
suppressor gene in a DNA sample from a tumor cell, which 
method comprises the steps of: 

(a) (i) contacting an agent that degrades unmethylated 
DNA with the DNA sample from the tumor cell, under 
suitable conditions, so as to degrade unmethylated 
DNA in the sample, and (ii) contacting an agent that 
degrades methylated DNA with a DNA sample from a 
normal cell corresponding to the tumor cell, under 
suitable conditions, so as to degrade methylated DNA 
in the sample; 

(b) contacting the resulting samples with each other 
under conditions permitting reannealing between the 
DNA strands therein, so as to permit the formation 
of a hybrid DNA duplex comprising a DNA strand from 
the normal cell and a DNA strand from the tumor 
cell, should both such strands be present; 

(c) detecting the formation of any such hybrid DNA 
duplex, such formation indicating the presence of 
differential methylation between the DNA from the 
normal cell and the corresponding DNA from the tumor 
cell; and 

(d) determining whether the DNA strand from the tumor 
cell in the hybrid DNA duplex detected in step (c) 
comprises a tumor suppressor gene, thereby 
determining the presence of a tumor suppressor gene 
in the DNA sample from the tumor cell. 

20. The method of claim 19, further comprising the step of 
modifying the DNA of parts (i) and (ii) resulting from 
step (a) with a first and second moiety, respectively, so 
as to prevent, in step (b), the formation of a DNA duplex 
consisting of DNA strands from the normal cell or of a 
DNA duplex consisting of DNA strands from the tumor cell. 

21. The method of claim 20, wherein the modification of at 
least one sample resulting from step (c) comprises 
modifying the DNA in at least one sample with a moiety 
which facilitates the isolation of hybrid DNA duplexes 
formed in step (b). 

22. The method of claim 21, wherein the moiety is biotin. 

23. The method of claim 19, further comprising the step of 
determining the nucleic acid sequence of a hybrid DNA 
duplex whose presence is detected in step (c). 

24. The method of claim 23, further comprising the step of 
identifying the methylated nucleotide residues of one or 
both strands of the hybrid DNA duplex whose sequence is 
determined. 

25. The method of claim 19, wherein the agent that degrades 
methylated DNA is McrBC. 

26. The method of claim 19, wherein the agent that degrades 
unmethylated DNA comprises a methylation-sensitive 
restriction endonuclease. 

27. The method of claim 26, wherein the agent comprises a 
methylation-sensitive restriction endonuclease selected 
from the group consisting of HpaII, HhaI, MaeII, BstUI 
and AciI. 

28. The method of claim 19, wherein the tumor cell is a human 
cell, and the normal cell corresponding to the tumor cell 
is a human cell. 